Sardinian on Facebook: Analysing Diatopic Varieties through Translated Lexical Lists
Abstract
The presence of regional and minority languages in digital media is an indicator of their vitality. In this paper, we investigate quantitative aspects of the use of the Sardinian language on Facebook, focusing on the co-existence of diatopic varieties. We extracted linguistic data from public pages and, by translating the most frequent words, identified similarities and differences between the varieties.
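The comparison described in the abstract, ranking the most frequent words of each variety and aligning them through translation, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names (`top_words`, `to_concepts`, `jaccard`), the Jaccard overlap measure, the variety labels, and the toy word forms and glossary are all hypothetical and are not data from the paper.

```python
from collections import Counter


def top_words(tokens, n):
    """Return the n most frequent tokens, most frequent first."""
    return [word for word, _ in Counter(tokens).most_common(n)]


def to_concepts(words, glossary):
    """Map each word to its translation; words missing from the glossary are dropped."""
    return {glossary[w] for w in words if w in glossary}


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy input, assumed for illustration only (not taken from the paper).
variety_a_tokens = ["domo", "abba", "domo", "limba"]
variety_b_tokens = ["domu", "acua", "domu"]

# Hypothetical hand-made glossary mapping surface forms to English concepts.
glossary = {
    "domo": "house", "domu": "house",
    "abba": "water", "acua": "water",
    "limba": "language",
}

top_a = top_words(variety_a_tokens, 3)
top_b = top_words(variety_b_tokens, 3)

# Overlap on raw spellings versus overlap after translation.
surface_overlap = jaccard(set(top_a), set(top_b))
concept_overlap = jaccard(to_concepts(top_a, glossary), to_concepts(top_b, glossary))

print(f"surface overlap: {surface_overlap:.2f}")
print(f"concept overlap: {concept_overlap:.2f}")
```

In this toy setup the two lists share no spellings, yet most of their translated concepts coincide, which is the kind of similarity that a purely surface-level comparison would miss.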
Similar resources
Diatopic Patterning of Croatian Varieties in the Adriatic Region
The calculation of aggregate linguistic distances can compensate for some of the drawbacks inherent to the isogloss bundling method used in traditional dialectology to identify dialect areas. Synchronic aggregate analysis can also point out differences with respect to a diachronically based classification of dialects. In this study the Levenshtein algorithm is applied for the first time to obta...
Crowdsourcing Dialect Characterization through Twitter
We perform a large-scale analysis of language diatopic variation using geotagged microblogging datasets. By collecting all Twitter messages written in Spanish over more than two years, we build a corpus from which a carefully selected list of concepts allows us to characterize Spanish varieties on a global scale. A cluster analysis proves the existence of well defined macroregions sharing commo...
Analysis of Patent Abstracts
Text analysis involves the deconstruction of information within a text. This includes text structure, text pattern, linguistic features, lexical analysis, and syntactic analysis. This research took as its starting point the bottom-up approach of analysing the lexical features, syntactic features, and textual features of patent abstracts for comprehensive coverage of text analysis. Several tools...
Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
Corpus-based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency-based estimators of lexica...
Investigating Diatopic Variation in a Historical Corpus
This paper investigates diatopic variation in a historical corpus of German. Based on equivalent word forms from different language areas, replacement rules and mappings are derived which describe the relations between these word forms. These rules and mappings are then interpreted as reflections of morphological, phonological or graphemic variation. Based on sample rules and mappings, we show ...
Publication date: 2016